Lexicons and grammars for language processing: industrial or handcrafted products?
Abstract
In recent years, the use of linguistic data for language processing (semantic ambiguity resolution, translation...) has increased progressively. Such data are now commonly called language resources. A few years ago, nearly all the language resources used for this purpose were collections of texts such as the Brown Corpus and the Penn Treebank, but the use of electronic lexicons (WordNet, FrameNet, VerbNet, ComLex...) and formal grammars (TAG...) has developed recently. This development is slow because most of the processes for constructing lexicons and grammars are manual, whereas the construction of corpora has always been highly automated.
Similar resources
FreeLing 3.0: Towards Wider Multilinguality
FreeLing is an open-source multilingual language processing library providing a wide range of analyzers for several languages. It offers text processing and language annotation facilities to NLP application developers, lowering the cost of building those applications. FreeLing is customizable, extensible, and has a strong orientation to real-world applications in terms of speed and robustness. ...
Things between Lexicon and Grammar
A number of grammar formalisms were proposed in the 1980s, such as Lexical Functional Grammars, Generalized Phrase Structure Grammars, and Tree Adjoining Grammars. Those formalisms then started to put stress on the lexicon, and were called lexicalist (or lexicalized) grammars. Representative examples of lexicalist grammars were Head-driven Phrase Structure Grammars (HPSG) and Lexicalized Tree Adjoi...
MHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs
In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in Web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of the main tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of Tweets. To this end, we create subjectivity lexicons in which the words into ...
Outilex, plate-forme logicielle de traitement de textes écrits
The Outilex software platform, which will be made available to research, development and industry, comprises software components implementing all the fundamental operations of written text processing: processing without lexicons, exploitation of lexicons and grammars, language resource management. All data are structured in XML formats, and also in more compact formats, either readable or bina...
Constraints in Computational
Research reported in this paper a) extends the familiar notions of constraints and preferences in computational semantic analysis and generation; b) adapts constraint satisfaction techniques to the requirements of natural language processing; and c) combines i) large-scale static knowledge sources (grammars, ontologies and lexicons) with ii) processing algorithms and iii) an advanced control ar...
Publication date: 2009